TUTORIAL

Exposure-Response Modeling of Clinical End Points 
Using Latent Variable Indirect Response Models 

C Hu 1

Exposure-response modeling facilitates effective dosing regimen selection in clinical drug development, 
where the end points are often disease scores and not physiological variables. Appropriate models need 
to be consistent with pharmacology and identifiable from the time courses of available data. This article 
describes a general framework of applying mechanism-based models to various types of clinical end 
points. Placebo and drug model parameterization, interpretation, and assessment are discussed with a 
focus on the indirect response models. 
BACKGROUND

Clinical end points are often discrete, representing either 
the disease condition or its change from baseline. Dosing 
regimen selection in clinical drug development requires the 
understanding of the response time course as a function 
of the dose levels and frequencies. Decisions in phase IIb
and phase III often rely on limited information from earlier 
phases where few active regimens in addition to placebo 
have been tested. Even at the submission stage, dose 
justifications may still be necessary. A recently published 
scenario is briefly summarized below for illustration through- 
out this article. 



A MOTIVATING EXAMPLE 

Golimumab is a human immunoglobulin G1 kappa monoclonal antibody that
binds with high affinity to tumor necrosis factor-α. Data from two
phase III clinical trials were
available with clinical end points as the 20, 50, and 70% 
improvement in the American College of Rheumatology 
disease severity criteria (ACR20, ACR50, and ACR70). 1 A 
total of 976 subjects received placebo, 2 mg/kg q12-weeks,
4 mg/kg q12-weeks, or 2 mg/kg q8-weeks golimumab (with
loading doses and with and without methotrexate). Obser- 
vations of the end points and golimumab plasma con- 
centrations were collected approximately every 4 weeks 
until week 24 or 48. A population pharmacokinetic model 
was developed with individual empirical Bayes estimates 
obtained, allowing predictions of individual pharmacokinetic
profiles. More details can be found in the study by Hu
et al. 2 The objective was to predict the likely clinical end
point response time course under potential alternative dos- 
ing regimens. 

The example represents a common scenario in clinical 
drug development where dosing projections are required. 
This typically requires predictive modeling that efficiently 
incorporates all available information. This article describes a 
modeling framework to support this objective and discusses 
model choices along with related model-building topics. 



CATEGORICAL END POINT MODELING 

It may be easy to think about modeling each of the relevant end 
points in the motivating example separately. However, since 
dichotomized data are less informative than continuous vari- 
ables, identifying meaningful and precise relationships is often 
challenging. Because ACR20, ACR50, and ACR70 indicate dif- 
ferent levels of the same disease improvement, they can be 
combined into one ordered categorical end point, called ACR, 
having four possible outcomes: ACR = 0, if achieving ACR70; 
ACR = 1, if achieving ACR50 but not ACR70; ACR = 2, if 
achieving ACR20 but not ACR50; and ACR = 3, if not achieving 
ACR20. The ordering is arbitrary (and the naming of catego- 
ries is shifted from Hu et al. 2 for convenience in this article.) 
Modeling a combined variable (ACR) usually achieves better 
analysis efficiency than separately modeling each level. 

More generally, assume that the clinical end point $Z(t)$ takes
possible response values $k = 0, 1, \ldots, m$. To maintain
consistency with major applications to date, $Z(t)$ is assumed to
indicate disease severity, with larger values of $k$ corresponding
to worse disease conditions. Logistic and probit regressions,
known to perform similarly, 3 are the standard statistical
techniques to model ordered categorical variables. 4 They link the
probabilities of achieving response level $k$ to the predictor
$M(t)$ in the following form:

$$\mathrm{prob}[Z(t) \le k] = h[\beta_k - M(t)], \quad k = 0, 1, \ldots, m-1$$

where $\beta_0 < \beta_1 < \cdots < \beta_{m-1}$ are intercepts, and
$h(x)$ is a link function that restricts the probability between 0
and 1. It is often written with the inverse link function as:

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \beta_k - M(t)$$

For logistic regression, $h(x) = \exp(x)/[1 + \exp(x)]$, and
$h^{-1}(x) = \log[x/(1 - x)]$. For probit regression, $h(x) = \Phi(x)$,
where $\Phi$ is the cumulative distribution function of the standard
normal distribution.

In typical applications, the model predictor $M(t)$ is assumed
to be the same for all $k$. For logistic regression, this
corresponds to the proportional odds assumption, which states
that the cumulative odds ratio for any two values of the
predictors is constant across response categories. Statistical
tests of this assumption exist but tend to have low power 4
and are therefore not commonly conducted in practice. In the
(rare) events where notable misfits do occur, more complex 
structures could be considered. 5 

Between-subject variability 

In logistic and probit regression applications in pharmaco- 
metrics literature to date, typically only one between-subject 
variability (BSV) term has been applied on the intercept level, 
as follows: 

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + \beta_k - M(t) \qquad (1)$$

with $\eta \sim N(0, \omega^2)$. In this form, $\eta$ is often
interpreted as baseline variability. However, it actually represents
the variability of the average overall time course. It is noted that
BSV also
represents within-subject correlations of the response. Mul- 
tiple sources of BSV are reasonable to expect but might not 
be supported by the data, likely because categorical data are 
less informative than continuous data. 

Predicting population mean response probability 

An important quantity for dosing regimen selection is the population
mean response probability $E\{\mathrm{prob}[Z(t) \le k]\}$ under the
mixed-effect model in Eq. 1. This is typically calculated by
simulation as follows. For a given time point $t$ and clinical trial
population size $n$, draw a normal random variate
$\eta_i \sim N(0, \omega^2)$ for each subject $i = 1, \ldots, n$. Then,
draw a uniform random variate $u_i$. (If additional BSV terms appear
in $M(t)$, a corresponding vector of random variates will also need
to be drawn.) Finally, the clinical response $Z_i$ is given the value
$k$ such that $u_i$ falls within the interval
$(h\{\eta_i + \beta_{k-1} - M(t)\},\ h\{\eta_i + \beta_k - M(t)\}]$,
with the convention that $Z_i = 0$ when
$u_i \le h\{\eta_i + \beta_0 - M(t)\}$, and $Z_i = m$ when
$u_i > h\{\eta_i + \beta_{m-1} - M(t)\}$. After simulating responses
$Z_1, Z_2, \ldots, Z_n$, the response frequency of category $k$ can be
calculated as the counting frequency (# of $Z_i = k$)/$n$, and the
cumulative frequency (# of $Z_i \le k$)/$n$ estimates
$E\{\mathrm{prob}[Z(t) \le k]\}$. A precise estimate can be obtained
by using a large $n$, say 100,000.
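
The simulation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic link is used, the intercepts, the value of M(t), and the BSV standard deviation are hypothetical, and the function names are not taken from any published code.

```python
import math
import random


def logistic(x):
    """Inverse logit link h(x) = exp(x) / [1 + exp(x)]."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_category_frequencies(betas, m_t, omega, n=100_000, link=logistic, seed=1):
    """Simulate Z_1..Z_n at one time point; return frequencies of k = 0..m.

    betas : increasing intercepts beta_0 < ... < beta_{m-1}
    m_t   : value of the predictor M(t) at the time point of interest
    omega : SD of the between-subject random effect eta
    """
    rng = random.Random(seed)
    counts = [0] * (len(betas) + 1)
    for _ in range(n):
        eta = rng.gauss(0.0, omega)   # subject-level random effect
        u = rng.random()              # uniform variate on [0, 1)
        # Z is the first k with u <= h(eta + beta_k - M(t)); otherwise Z = m.
        z = len(betas)
        for k, beta in enumerate(betas):
            if u <= link(eta + beta - m_t):
                z = k
                break
        counts[z] += 1
    return [c / n for c in counts]


# Hypothetical parameter values, chosen only to exercise the function.
freqs = simulate_category_frequencies(betas=[-1.0, 0.0, 1.5], m_t=0.3, omega=0.8, n=20_000)
cumulative = [sum(freqs[: k + 1]) for k in range(len(freqs))]
print(freqs)
print(cumulative)
```

The cumulative sums of the category frequencies approximate the population mean probabilities of being at or below each category; increasing n reduces the Monte Carlo error.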

Probit regression can, however, allow a convenient analytical
determination of the model-predicted average probabilities. With
$\eta$ representing the only random effect in Eq. 1, and letting
$q_k = \beta_k - M(t)$ be the fixed-effect term, the expected
probability is given as follows 6:

$$E\{\mathrm{prob}[Z(t) \le k]\} = \Phi\left(q_k / \sqrt{1 + \omega^2}\right) \qquad (2)$$
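
For readers who want the intermediate step, the closed form for the probit case can be justified by a short argument. The auxiliary variable $\varepsilon$ below is introduced only for this sketch; it is a standard normal variate assumed independent of $\eta \sim N(0, \omega^2)$.

```latex
\begin{aligned}
E\{\Phi(\eta + q_k)\}
  &= \mathrm{prob}(\varepsilon \le \eta + q_k) \\
  &= \mathrm{prob}(\varepsilon - \eta \le q_k),
     \qquad \varepsilon - \eta \sim N(0,\, 1 + \omega^2) \\
  &= \Phi\!\left( \frac{q_k}{\sqrt{1 + \omega^2}} \right)
\end{aligned}
```

No comparable closed form is available for the logistic link, which is why simulation is typically used there.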

EXPOSURE-RESPONSE MODELING 

Exposure-response (E-R) modeling can facilitate effective 
dosing decisions. 7 For categorical end point modeling, M(t) 
in Eq. 1 needs to be chosen as a function of drug exposure. 
It may be easy to consider linear functions of some observed 
systemic exposure measures, such as trough drug concen- 
trations, area under the curve (AUC), or log(AUC) at or prior 
to certain time points, e.g., week 24 in the motivating exam- 
ple. More mechanism-based approaches have also been 
used. Although exact distinctions could be difficult, some 
major characteristics of empirical and mechanism-based 
approaches are elaborated below. 



Empirical approaches 

Using observed exposure metrics such as AUC leads to 
direct correlations between observed end points and expo- 
sure measures, which may be the simplest and thus perhaps 
more often used approach in clinical development. Such 
correlations describe the data from which they are built, by 
default. However, as a general principle, data at hand are 
already known, thus the more important question is under 
what circumstances the model is likely to predict well. Under 
linear pharmacokinetics, such correlations could be used to 
determine dose-response relationships when the dosing fre- 
quency is fixed, with the observed exposure measures serv- 
ing as surrogates for the overall exposure profile. When the 
pharmacodynamics is fast relative to the pharmacokinetic
half-lives, the empirical correlation may be mechanistically
interpretable. However, in the more common situations where
there are delays between measured concentrations and
effects, these correlations generally will not hold under different
dosing frequencies or administration routes. For example,
in the above motivating example, relationships between the 
concentration and ACR20 at week 24 in principle will differ 
between the q8-weeks and the q12-weeks regimens. In addi- 
tion, the direct correlations typically may be built with only 
fractions of all observed E-R data, such as the measured 
concentrations and ACR assessments at week 24, again 
because the relationships likely will differ for different time 
points such as week 8 or week 16. In phase II, categorical 
response data are often too sparse to allow precise quantifi- 
cation of such relationships, e.g., the concentration-ACR20 
response at a particular time point. 

Likewise, the recently emerging Markov transition models, 
while having been shown to describe the data well, may have 
limited predictive ability. To see this, it may help to consider
why, for continuous end points, a current observation is typically
not modeled based on a previous one. From a mechanistic
viewpoint, responses are determined by the pharmacodynam- 
ics characteristics and the exposure time course. These may 
not be fully represented by the immediate past responder/non- 
responder status and the current drug concentration level. In the 
motivating example, the ACR20 response at week 24 should 
be determined by the exposure time course between weeks 0 
and 24 and the pharmacodynamic delay characteristics, which 
may not be fully represented by the ACR20 responder status at 
week 20 and the drug concentration at week 24. 

Mechanism-based approaches 

Mechanism-based models extrapolate better across varieties 
of scenarios and can be built using all available data. With 
limited data in practice, care should be taken to include the 
crucial features that are supportable. As a general principle, 
the modeling approach used should be determined in con- 
junction with the intended use of the model. In practice, the 
least empirical or most interpretable, flexible model that can 
be supported by the data should typically be used. 

Dosing selection generally requires the models to extrapolate
across different dosing frequencies and administration routes.
This typically requires modeling the delay between systemic
exposure and clinical response, along with the placebo effect.
These features ideally should be incorporated in a principled
and natural way. A class of E-R models suitable for this purpose
is the widely used types I-IV indirect response (IDR) models. 8
For clinical end points, more complex models are not likely to
be effectively estimated in late developmental stages.

Indirect response models 

IDR models may be generally described as follows 8: let the
response be $R = R(t)$, with baseline $R(0) = R_0$, then:

$$\frac{dR}{dt} = k_{in}[1 + H_1(t)] - k_{out}[1 + H_2(t)]\,R, \quad R_0 = k_{in}/k_{out} \qquad (3)$$

where $H_1(t)$ and $H_2(t)$ take the form of
$E_{max} C(t)/[C_{50} + C(t)]$, where $C(t)$ is drug concentration
and $C_{50}$ and $E_{max}$ are model parameters. For type I and
type III IDR models, $E_{max} = -I_{max}$ or $S_{max}$ represents the
maximum inhibitory or stimulation effect, respectively, and
$H_2(t) = 0$. For type II and type IV IDR models,
$E_{max} = -I_{max}$ or $S_{max}$, respectively, and $H_1(t) = 0$.
Model equations for various baseline-normalized IDR models have been
derived, including reduction-from-baseline ($R_0 - R$) and
ratio-of-baseline ($R/R_0$). 2,9 They allow the applications of
alternatively parameterized IDR models to clinical end points. 2
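
As an illustration of how the general IDR form behaves, the sketch below integrates Eq. 3 numerically with a classical Runge-Kutta scheme. The concentration profile, the parameter values, and the function names are hypothetical and chosen only for demonstration; in practice the concentration would come from the pharmacokinetic model.

```python
import math


def idr_rhs(r, c, k_in, k_out, emax1=0.0, emax2=0.0, c50=1.0):
    """Right-hand side of Eq. 3 with H_i(t) = emax_i * C / (C50 + C)."""
    h1 = emax1 * c / (c50 + c)
    h2 = emax2 * c / (c50 + c)
    return k_in * (1.0 + h1) - k_out * (1.0 + h2) * r


def simulate_idr(conc, t_end, dt, k_in, k_out, **drug):
    """Integrate Eq. 3 from baseline R_0 = k_in / k_out with classical RK4.

    conc is a function t -> C(t); returns lists of times and responses.
    """
    t, r = 0.0, k_in / k_out
    times, resp = [t], [r]
    while t < t_end - 1e-12:
        k1 = idr_rhs(r, conc(t), k_in, k_out, **drug)
        k2 = idr_rhs(r + 0.5 * dt * k1, conc(t + 0.5 * dt), k_in, k_out, **drug)
        k3 = idr_rhs(r + 0.5 * dt * k2, conc(t + 0.5 * dt), k_in, k_out, **drug)
        k4 = idr_rhs(r + dt * k3, conc(t + dt), k_in, k_out, **drug)
        r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
        times.append(t)
        resp.append(r)
    return times, resp


def conc(t):
    """Hypothetical mono-exponential concentration after a single dose."""
    return 10.0 * math.exp(-0.1 * t)


# Type I IDR model (inhibition of production): E_max = -I_max with I_max = 0.8.
times, resp = simulate_idr(conc, t_end=100.0, dt=0.1, k_in=2.0, k_out=0.2,
                           emax1=-0.8, c50=1.0)
print(resp[0], min(resp), resp[-1])
```

Passing `emax2` instead of `emax1` gives the type II and type IV forms; with both left at zero the response stays at its baseline.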

Although originally designed for physiological variables, 
the IDR models may be viewed as collapsed versions of 
more complex mechanistic models. 3 This important fact facili- 
tates the interpretation of IDR models in a broad range of 
applications and supports their predictive ability. 

LATENT VARIABLE REPRESENTATION 

The logistic/probit regression model in Eq. 1 may appear
empirical, thus making it unclear how best to incorporate any
mechanism-based models into the predictor $M(t)$. For example,
it may seem equally reasonable to use either $M(t) = R(t)$
or another function such as $M(t) = \log[R(t)]$, with $R(t)$ being an
IDR model. To select mechanism-consistent model forms, it is
important to realize that logistic and probit regressions may be
motivated by conceiving an unobservable, hence latent,
physiological variable. The latent variable represents the underlying
cause such that when it crosses certain thresholds, it causes
the observed response to fall in the respective categories. 4
Then, IDR models may be applied to the latent variable.

To derive the latent variable representation, let $L(t)$ be the
latent variable and $\{\beta_k\}$ be the thresholds such that

$$Z(t) \le k \iff L(t) \le \beta_k$$

Assume that $L(t)$ is modeled as

$$L(t) = M(t) + \sigma\varepsilon \qquad (4)$$

where $M(t)$ is the model predictor, $\varepsilon \sim N(0, 1)$ follows
the standard normal distribution, and $\sigma$ is the error SD. For
notational convenience of this derivation, BSV terms are assumed to be
contained in $M(t)$. Then,

$$\mathrm{prob}[Z(t) \le k] = \mathrm{prob}[L(t) \le \beta_k]
= \mathrm{prob}[\varepsilon \le (\beta_k - M(t))/\sigma]
= \Phi[(\beta_k - M(t))/\sigma]$$

When the model specification for $M(t)$ includes a multiplicative
parameter, $\sigma$ will not be separately identifiable and may
be assumed to equal 1, and the above can be written as

$$\Phi^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \beta_k - M(t)$$

which corresponds to probit regression, with $\Phi^{-1}$ as the inverse
of the cumulative normal distribution function. Logistic regression
in the same form may be similarly derived by assuming
the logistic distribution instead of the normal. While standard
in statistical theory, 4 in the context of applying IDR models, this
derivation was first given by Hutmacher et al. 3 The derivation
allows the interpretation of $M(t)$ in Eq. 1 as a physiological
variable, and consequently, the mechanistic consistency of applying
IDR models to $M(t)$.
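
The equivalence between thresholding a latent variable and probit regression can be checked numerically. The sketch below is illustrative only: it assumes an error SD of 1, uses hypothetical thresholds and a hypothetical value of M(t), and relies on Python's standard `statistics.NormalDist` for the normal distribution function.

```python
import random
from statistics import NormalDist


def latent_category(latent, thresholds):
    """Return the smallest k with L <= beta_k, or m if L exceeds all thresholds."""
    for k, beta in enumerate(thresholds):
        if latent <= beta:
            return k
    return len(thresholds)


def compare_latent_to_probit(m_t, thresholds, n=50_000, seed=7):
    """Simulate L = M(t) + eps (sigma = 1) and compare with Phi(beta_k - M(t))."""
    rng = random.Random(seed)
    z = [latent_category(m_t + rng.gauss(0.0, 1.0), thresholds) for _ in range(n)]
    phi = NormalDist().cdf
    rows = []
    for k, beta in enumerate(thresholds):
        empirical = sum(1 for zi in z if zi <= k) / n   # observed prob[Z <= k]
        rows.append((k, empirical, phi(beta - m_t)))    # probit prediction
    return rows


rows = compare_latent_to_probit(m_t=0.4, thresholds=[-0.5, 0.3, 1.2])
for k, empirical, theoretical in rows:
    print(k, round(empirical, 3), round(theoretical, 3))
```

The two columns should agree up to Monte Carlo error, which shrinks as n grows.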

Incorporating the placebo effect 

Clinical trials typically include placebo arms. The placebo 
treatment may also take the form of an active standard ther- 
apy, and the trial evaluates the add-on effect of the investiga- 
tional drug. If the placebo period is sufficiently long, the effect 
may attenuate from its peak. This could be due to multiple rea- 
sons, including regression to the mean. However, most often, 
the attenuation is not observed during the course of the trial, 
especially for large clinical trials. Placebo effect is typically 
modeled empirically and, for simplicity, is assumed beneficial
and increasing-in-time in this article. The 1-pathway and
2-pathway approaches, elaborated below, have been used to 
incorporate placebo effect under IDR models. They differ in 
interpretation and ease of implementation. 

1-Pathway. This approach may be the easier one to conceive
in order to incorporate placebo effect under the standard IDR
model form of Eq. 3. It applies IDR models directly to $M(t)$
and then models the placebo effect on a parameter, typically
on $k_{in}$, e.g., as $k_{in}(t) = k_{in0}[1 - P_{max}\exp(-r_p t)]$.
A latent variable approach in this form was applied to ACR20, ACR50,
and ACR70 by Hu et al. 10 and subsequently used with a more
complex physiological model by Ait-Oudhia et al. 11 The placebo
model parameters cannot be separately estimated from
placebo data alone. This may create difficulty for finding good
initial parameter estimates as well as stable overall parameter
estimation.

2-Pathway. This approach separates the placebo and drug
effect terms. A convenient choice is:

$$M(t) = f_p(t) + f_d(t) \qquad (5)$$

where $f_p(t)$ and $f_d(t)$ correspond to placebo and drug effect,
respectively, and at baseline $f_p(0) = 0$. Then, IDR models can
be used for $f_d(t)$. The ease of parameter estimation of the
2-pathway approach may depend on the parameterization
chosen, as shown below.

Combining Eqs. 1 and 5 gives:

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + \beta_k - f_p(t) - f_d(t) \qquad (6)$$

Early applications substituted $R(t)$ in Eq. 3 for $f_d(t)$ and directly
estimated the model parameters in this form. 3 However, when
no drug is given, $f_d(t) = f_d(0) = R_0 > 0$; therefore, $f_d$ contains
the drug effect as well as a component of baseline, and again,
the placebo model parameters are not separately identifiable
from placebo data, as with the 1-pathway approach. On the
other hand, Eq. 6 could be rewritten, or reparameterized, as:

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + (\beta_k - R_0) - f_p(t) + [R_0 - f_d(t)]$$

Writing $\alpha_k = \beta_k - R_0$, $g_p(t) = -f_p(t)$, and letting
$R_1(t) = R_0 - f_d(t)$ be the corresponding reduction-from-baseline
IDR model, then

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + \alpha_k + g_p(t) + R_1(t) \qquad (7)$$

where $g_p(t)$ may be empirically modeled with $g_p(0) = 0$.
Furthermore, $R_1(t)$ is shown by Hu et al. 2 to satisfy

$$\frac{dR_1}{dt} = -k_{in}[1 + H_1(t)] + k_{out}[1 + H_2(t)]\,(R_0 - R_1), \quad R_0 = k_{in}/k_{out}, \quad R_1(0) = 0 \qquad (8)$$
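
The relationship between the standard IDR form and its reduction-from-baseline counterpart can be verified numerically: integrating Eq. 3 for R and Eq. 8 for R_1 should give R_1 = R_0 - R at every time. The sketch below does this for a type I model; the concentration profile, parameter values, and helper names are hypothetical.

```python
import math


def rk4(f, y0, t_end, dt):
    """Classical fourth-order Runge-Kutta for a scalar ODE dy/dt = f(t, y)."""
    t, y = 0.0, y0
    for _ in range(round(t_end / dt)):
        k1 = f(t, y)
        k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
        k3 = f(t + 0.5 * dt, y + 0.5 * dt * k2)
        k4 = f(t + dt, y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return y


# Hypothetical parameter values for a type I IDR model (H_2 = 0).
K_IN, K_OUT, I_MAX, C50 = 2.0, 0.2, 1.0, 1.0
R0 = K_IN / K_OUT


def h1(t):
    """H_1(t) = -I_max * C / (C50 + C) with a hypothetical C(t)."""
    c = 10.0 * math.exp(-0.1 * t)
    return -I_MAX * c / (C50 + c)


def eq3(t, r):
    """Standard form: dR/dt = k_in [1 + H_1] - k_out R."""
    return K_IN * (1.0 + h1(t)) - K_OUT * r


def eq8(t, r1):
    """Reduction-from-baseline form: dR_1/dt = -k_in [1 + H_1] + k_out (R_0 - R_1)."""
    return -K_IN * (1.0 + h1(t)) + K_OUT * (R0 - r1)


r_end = rk4(eq3, R0, t_end=20.0, dt=0.05)    # R(20), starting from R_0
r1_end = rk4(eq8, 0.0, t_end=20.0, dt=0.05)  # R_1(20), starting from 0
print(R0 - r_end, r1_end)
```

The two printed values should coincide up to floating-point rounding, since the two equations describe the same trajectory in different coordinates.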

The interpretation of the parameters in the reparameterized
form Eqs. 7-8 can be compared with those in Eqs. 6 and
3. Some remain the same and retain their interpretations;
the BSV and placebo effect terms are identical, as are the
IDR model parameters, including $R_0$, which represents the
baseline latent variable value via Eq. 5. However, the
interpretation of the constant terms changes; while $\beta_k$ in Eq. 6
represent latent variable thresholds, $\alpha_k$ in Eq. 7 have the
empirical interpretation as intercepts, representing baseline
probabilities (along with BSV). When no drug is given, $R_1(t) = 0$
in Eq. 7; therefore, the remaining terms $[\eta + \alpha_k + g_p(t)]$
represent placebo response so that the related parameters can
be separately estimated from placebo data. This is advantageous
for stable parameter estimation.

Hu et al. 2,12,13 proved that Eqs. 7-8 were equivalent to the
latent variable IDR model in their earlier applications. The earlier
model 13,14 was motivated somewhat heuristically to apply
the IDR model in a way that allows the placebo effect parameters
to be separately estimated from placebo data. The
proof showed that $R_0$ was equivalent to the maximum drug
effect parameter in the earlier application when the maximum
inhibitory or stimulation effect $E_{max} = 1$. Eqs. 7-8 also relate
more directly to the standard IDR model forms than the earlier
model. 13,14 Most importantly, the latent variable interpretation
of Eqs. 7-8 substantiates the approach with mechanistic
consistency, thereby strengthening its applications.

The mechanistic rationale of the 1-pathway approach
could be questionable, as it is often difficult to justify why pla- 
cebo (or the active control) would affect the same pathway. In 
contrast, under the 2-pathway approach, the total drug and 
placebo effect may be interpreted as the sum of those from 
the drug pathway and other nonspecific pathways. 15 Because 
of this and the earlier mentioned advantage of stable param- 
eter estimation, the 2-pathway approach will be the focus of 
the remainder of this article. 

CHANGE-FROM-BASELINE END POINTS 

Clinical end points may represent various types of change 
from baseline in disease conditions, such as the Likert scale 
measures for patient reported outcomes, DAS28 response 
criteria (defined according to the magnitude of DAS28 
improvement from baseline and by baseline DAS28 cat- 
egories) in rheumatoid arthritis, and ACR20, ACR50, and 
ACR70. 16,17 The definitions of these end points suggest that



they may be considered as arithmetic change, proportional 
change, and percent improvement from baseline in nature, 
respectively. These different natures conceivably could 
change how the underlying latent variables influence the 
clinical end points, thus affecting how they should be mod- 
eled. Addressing this issue requires latent variable model 
derivations according to each end point type, which are given 
in the Supplementary Appendix. The derivations show that 
the latent variable model in Eqs. 7-8 may suit these various 
types of change-from-baseline clinical end points, and thus 
retain their mechanistic consistency. 

PLACEBO MODEL INTERPRETATION AND CHOICE 

At baseline, $g_p(0) = R_1(0) = 0$ in Eq. 7. For probit regression,
this implies $q_k = \alpha_k$ in Eq. 2, and

$$E\{\mathrm{prob}[Z(0) \le k]\} = \Phi\left(\alpha_k / \sqrt{1 + \omega^2}\right)$$

Thus, the baseline probabilities can be directly computed
from the intercepts along with the BSV variance, a fact that
also holds for logistic regression although simulation calculations
are necessary. This fact allows approximate initial estimates
of the intercepts $\alpha_k$ to be obtained from observed baseline
response (by first selecting a reasonable initial estimate of $\omega$).
Similarly, if $g_p(\infty) = P_{max}$, the terms $(\alpha_k + P_{max})$
represent average steady-state placebo probabilities; thus an
approximate initial estimate of $P_{max}$ may be conveniently obtained
if the placebo response plateau is observed.
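
A minimal sketch of how such initial estimates might be computed for the probit case is given below. It inverts the baseline formula for the intercepts and, taking the steady-state placebo term as the intercept plus the plateau value of the placebo function in Eq. 7, inverts the same formula at the plateau. The observed proportions, the value of omega, and the function names are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def initial_intercepts(cum_props, omega):
    """alpha_k = sqrt(1 + omega^2) * Phi^{-1}(p_k), inverting the baseline formula.

    cum_props are observed baseline proportions prob[Z <= k], strictly in (0, 1).
    """
    inv = NormalDist().inv_cdf
    return [sqrt(1.0 + omega ** 2) * inv(p) for p in cum_props]


def initial_pmax(alpha_k, plateau_prop, omega):
    """Solve Phi((alpha_k + P_max) / sqrt(1 + omega^2)) = plateau_prop for P_max."""
    return sqrt(1.0 + omega ** 2) * NormalDist().inv_cdf(plateau_prop) - alpha_k


# Hypothetical observed cumulative proportions at baseline for k = 0, 1, 2.
alphas = initial_intercepts([0.02, 0.08, 0.25], omega=1.0)
# Hypothetical placebo plateau: 40% of placebo subjects reach Z <= 2.
p_max = initial_pmax(alphas[2], 0.40, omega=1.0)
print([round(a, 3) for a in alphas], round(p_max, 3))
```

These values are meant only as starting points for estimation; as with any initial estimates, they depend on the initial choice of omega.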

It is noted that the placebo model parameters generally 
should still be jointly estimated with drug effect parameters 
from all data to gain estimation efficiency. Nevertheless, as 
mentioned earlier, obtaining these initial estimates is an 
advantage of the reduction-from-baseline IDR model
parameterization (Eqs. 7-8). In contrast, with the original
parameterization (Eqs. 6 and 3), because $\alpha_k = \beta_k - R_0$,
the parameters $\beta_k$ and $R_0$ cannot be separated. That is, the
original parameterization does not allow any subset of parameters
to be estimated with a subset of the data.

A convenient choice of the placebo model in Eq. 7 takes
the exponential form:

$$g_p(t) = P_{max}[1 - \exp(-r_p t)] \qquad (9)$$

where $P_{max}$ is maximum effect and $r_p$ is rate of onset.

Alternative equivalent placebo model 

If the placebo model in Eq. 7 plateaus, it may be
reparameterized as follows: let $\alpha'_k = \alpha_k + g_p(\infty)$ and
$h_p(t) = g_p(t) - g_p(\infty)$; then
$\alpha_k + g_p(t) = \alpha'_k + h_p(t)$, and Eq. 7 becomes

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + \alpha'_k + h_p(t) + R_1(t)$$

where $\alpha'_k$ are threshold parameters to be estimated and $h_p(t)$
is constrained with $h_p(\infty) = 0$. This shows that in Eq. 7, the
alternative constraint of $g_p(\infty) = 0$ may be taken instead of
$g_p(0) = 0$. A convenient choice in this form is

$$g_p(t) = -P_{max}\exp(-r_p t) \qquad (10)$$

This reparameterization changes the interpretation of the
threshold parameters (along with the BSV variance) from
representing average probabilities at baseline to representing
average probabilities at steady state. This could lead to more
stable parameter estimation, in case the plateau of placebo
response is observed sufficiently long at steady state.

Placebo model reduction 

For change-from-baseline end points, Hu et al. 2 proposed a
reduced model restricting the baseline probability
$\mathrm{prob}[Z(0) \le k] = 0$, in the following form:

$$g_p(t) = \log[1 - \exp(-r_p t)] \qquad (11)$$

which has one fewer parameter than Eq. 9 or Eq. 10. The motivation
and interpretation are given in the Supplementary
Appendix. The reduced model may be preferable,
especially for small to moderate data sizes. The choice may
ultimately depend on how much complexity the data can support,
i.e., goodness of fit. In the motivating example, observed
data (Figure 1 of Hu et al. 2) indeed suggest that one parameter
may be sufficient to describe the placebo response time
course. When in doubt, testing both Eq. 11 and Eq. 10 may
be prudent.
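
To make the differences among the three placebo forms concrete, the sketch below codes Eqs. 9-11 as small Python functions and tabulates them at a few time points. The parameter values are hypothetical, and the value of Eq. 11 at time zero is returned as negative infinity to reflect the zero baseline probability.

```python
import math


def placebo_eq9(t, p_max, r_p):
    """Eq. 9: starts at 0 at baseline and rises to the plateau P_max."""
    return p_max * (1.0 - math.exp(-r_p * t))


def placebo_eq10(t, p_max, r_p):
    """Eq. 10: starts at -P_max at baseline and rises to 0 at steady state."""
    return -p_max * math.exp(-r_p * t)


def placebo_eq11(t, r_p):
    """Eq. 11: log[1 - exp(-r_p t)]; -inf at t = 0, so the baseline probability is 0."""
    if t <= 0.0:
        return -math.inf
    return math.log(-math.expm1(-r_p * t))   # -expm1(-x) equals 1 - exp(-x)


# Hypothetical parameter values.
P_MAX, R_P = 1.2, 0.15
table = [(t, placebo_eq9(t, P_MAX, R_P), placebo_eq10(t, P_MAX, R_P), placebo_eq11(t, R_P))
         for t in (0.0, 4.0, 12.0, 24.0, 48.0)]
for row in table:
    print(row)
```

Eqs. 9 and 10 differ only by the constant P_max, which is absorbed into the intercepts; Eq. 11 has a different shape near baseline and no magnitude parameter.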

INDIRECT RESPONSE MODEL USAGE 

In the latent variable framework, the maximum inhibitory or
stimulation effect $E_{max}$, i.e., $-I_{max}$ or $S_{max}$ in the
general IDR model representation Eq. 3, is usually not separately
estimable from the remaining IDR model parameters and should
be set to 1. Some earlier applications tested this explicitly. 2 A
heuristic argument is given below. Consider a steady-state
infusion that maintains concentration $C(t)$ at $C_{ss}$ and the drug
effect $R_1(t)$ at $R_{ss}$. Setting the left side of Eq. 8 equal to 0
gives:

$$k_{in}[1 + H_1(t)] = k_{out}[1 + H_2(t)]\,(R_0 - R_{ss})$$

Solving this for $R_{ss}$ gives:

$$R_{ss} = R_0 - R_0[1 + H_1(t)]/[1 + H_2(t)] = R_0\{1 - [1 + H_1(t)]/[1 + H_2(t)]\}$$

For type I and type III IDR models, this leads to

$$R_{ss} = R_0[-H_1(t)] = -R_0 E_{max} / (C_{50}/C_{ss} + 1)$$

therefore, $R_0$ and $E_{max}$ enter only through their product and
are not separately identifiable.
For type II and type IV IDR models,

$$R_{ss} = R_0\{1 - 1/[1 + H_2(t)]\} = [R_0 H_2(t)]/[1 + H_2(t)]
= \frac{R_0 E_{max}}{C_{50}/C_{ss} + 1 + E_{max}}
= \frac{R_0 E_{max}/(1 + E_{max})}{[C_{50}/(1 + E_{max})]/C_{ss} + 1}$$

so that $R_0$ and $E_{max}$ are indistinguishable and so are $C_{50}$ and
$E_{max}$.
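
The heuristic argument can be illustrated numerically. The sketch below evaluates the steady-state drug effect obtained by setting the left side of Eq. 8 to zero and shows that different parameter combinations give identical values at every steady-state concentration. It addresses steady state only, and all parameter values and function names are hypothetical.

```python
def steady_state_type13(r0, e_max, c50, c_ss):
    """Types I/III (H_2 = 0): R_ss = -R_0 * H_1, with H_1 = e_max * C_ss / (C50 + C_ss)."""
    return -r0 * e_max * c_ss / (c50 + c_ss)


def steady_state_type24(r0, e_max, c50, c_ss):
    """Types II/IV (H_1 = 0): R_ss = R_0 * H_2 / (1 + H_2)."""
    h2 = e_max * c_ss / (c50 + c_ss)
    return r0 * h2 / (1.0 + h2)


concentrations = [0.1, 1.0, 5.0, 50.0]

# Type I: only the product R_0 * E_max matters (here 10 * -0.5 == 5 * -1.0).
type1_a = [steady_state_type13(10.0, -0.5, 2.0, c) for c in concentrations]
type1_b = [steady_state_type13(5.0, -1.0, 2.0, c) for c in concentrations]

# Type IV: only R_0 * E_max / (1 + E_max) and C50 / (1 + E_max) matter
# (both parameter sets below give 3 and 1, respectively).
type4_a = [steady_state_type24(6.0, 1.0, 2.0, c) for c in concentrations]
type4_b = [steady_state_type24(4.0, 3.0, 4.0, c) for c in concentrations]

print(type1_a, type1_b)
print(type4_a, type4_b)
```

Because the paired parameter sets cannot be told apart from steady-state responses, fixing the maximum effect is a natural way to restore identifiability in this setting.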

EXTENSIONS TO CONTINUOUS END POINTS 

When the number of clinical response categories becomes
large, the end point may often be treated as continuous.
Existing applications typically apply IDR models directly to
the end point, often using the 1-pathway approach to model
the placebo effect. 18 The differences between 1- and 2-pathway
models have been discussed. 3 The latent variable model
derivation may be extended to this case by treating the latent
variable $L(t)$ as the observed response $Z(t)$. Specifically,
combining Eqs. 4 and 5 and using a similar baseline
reparameterization as in the derivation leading to Eq. 7 gives the
following continuous end point analog:

$$Z(t) = b + g_p(t) + R_1(t) + \sigma\varepsilon \qquad (12)$$

where $R_1(t)$ satisfies Eq. 8, and BSV may be placed on
more parameters in addition to baseline $b$. The placebo
effect $g_p(t)$ may still be modeled with the exponential model
Eq. 9. Eq. 12 is equivalent to some existing applications;
however, the 2-pathway reduction-from-baseline model
parameterization may still allow clarity in interpretation and
stable estimation.

Hutmacher et al. 15 argued that the actual physiological
mechanism may collapse to any of the four IDR models, and
therefore, all IDR models should be tried. In the general
setting of modeling continuous and ordered categorical clinical
end points, Hu et al. 2 proved a symmetry between type I and
type III IDR models when the link function is symmetric. The
symmetry is motivated by the observation that drug effects on
clinical end points can be modeled either as reducing harmful
effects or as increasing beneficial effects. For example, in the
motivating example, the drug effect can be modeled in either
of the two approaches: (i) increasing the probability of achieving
ACR responses and (ii) reducing the probability of failing
to achieve ACR responses. The symmetry states that using
type I/III IDR models in approach (i) is equivalent to using type
III/I models in approach (ii) in that the two approaches will
result in identical model predictions along with corresponding
IDR model parameters. This result is general because most
practically used link functions are symmetric, including the
logit, probit (Eq. 7), and the identity or minus-identity (Eq. 12)
links. Therefore, there are only three IDR models to be tried
when modeling clinical end points.

OTHER EXTENSIONS 

Two possible further applications are briefly described below. 
Lognormal error latent variable 

Conceivably, $L(t)$ as the underlying physiological variable
should be positive, which motivates the lognormal error
model $L(t) = M(t)\exp(\sigma\varepsilon)$. Incorporating the placebo
effect multiplicatively as $M(t) = f_p(t) f_d(t)$, a derivation similar
to that of Eqs. 7-8 leads to:

$$h^{-1}\{\mathrm{prob}[Z(t) \le k]\} = \eta + \alpha_k + g_p(t) - D_e \log[R_2(t)] \qquad (13)$$

where $D_e$ is a model parameter, and $R_2(t)$ satisfies the
ratio-of-baseline IDR model form 9:

$$\frac{dR_2}{dt} = k_{out}[1 + H_1(t)] - k_{out}[1 + H_2(t)]\,R_2, \quad R_2(0) = 1 \qquad (14)$$

A convenient choice for the placebo model $g_p(t)$ is again
Eq. 9.

The lognormal error model Eqs. 13-14 could be considered
as inferior to the normal error model Eqs. 7-8 due
to conceptual reasons. On the other hand, it is interesting
to compare Eq. 13 with the earlier model used by Hu
et al. 2,12,13 Those were the same as Eq. 13 with the term
$-D_e \log[R_2(t)]$ replaced by its first-order Taylor expansion
$D_e[1 - R_2(t)]$. Therefore, models Eqs. 7-8 and Eqs. 13-14
may be expected to perform similarly when $[1 - R_2(t)]$ is
small, i.e., when $R_2(t)$ remains close to its baseline value of 1.
The author's limited experience to date with data used in early
applications was consistent with this notion. 2,13,14 Further
confirmation is necessary.

Correlation between multiple end points 

Clinical trials may have multiple end points, and the mag- 
nitude of raw correlation between the end points indicates 
the similarity level between the disease components that 
the end points are designed to measure. Jointly modeling 
the end points allows the identification of common model 
parameters and BSVs or, in the case of simultaneous mod- 
eling of continuous and categorical end points, a latent 
structure that can predict both the continuous and cat- 
egorical end points. After this, extra correlations between 
the end points unaccounted for by BSVs may still remain 
and can be modeled by the correlations among (latent) variables.
In principle, this improves overall estimation efficiency (C. 
Hu, P. Szapary, A. Mendelsohn, and H. Zhou, personal 
communication). 6 

The standard approach to model the extra correlation 
uses the multivariate normal distribution. Although the probit 
latent variable approach allows it, the distribution is not
currently available in the common E-R modeling software
NONMEM. This can, however, be implemented with a conditional
approach that factorizes the joint likelihood as the product
of the likelihood of one variable with the conditional likelihood 
of the other (C. Hu, P. Szapary, A. Mendelsohn, and H. Zhou, 
personal communication). The magnitude of extra correla- 
tion can suggest whether the marginal end point models suf- 
ficiently accounted for the information in the individual end 
point data, thus providing a diagnostic on the quality of the 
marginal models. 
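
The conditional factorization can be illustrated for a pair of correlated normal variables. The sketch below is not NONMEM code and is not the implementation referred to above; it only checks, with hypothetical values, that the bivariate normal log-density equals the marginal log-density of one variable plus the conditional log-density of the other.

```python
import math


def norm_logpdf(x, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def joint_logpdf(x, y, mx, my, sx, sy, rho):
    """Log density of a bivariate normal, written out directly."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    return -math.log(2.0 * math.pi * sx * sy * math.sqrt(1.0 - rho * rho)) - 0.5 * quad


def factorized_logpdf(x, y, mx, my, sx, sy, rho):
    """log f(x) + log f(y | x), where
    y | x ~ N(my + rho * sy / sx * (x - mx), sy^2 * (1 - rho^2))."""
    cond_mean = my + rho * sy / sx * (x - mx)
    cond_sd = sy * math.sqrt(1.0 - rho * rho)
    return norm_logpdf(x, mx, sx) + norm_logpdf(y, cond_mean, cond_sd)


# Hypothetical values for two correlated (latent) variables.
args = dict(x=0.7, y=-0.4, mx=0.2, my=0.1, sx=1.0, sy=1.5, rho=0.6)
print(joint_logpdf(**args), factorized_logpdf(**args))
```

Because each factor is a univariate normal density, the factorized form needs only distribution functions that are commonly available, which is the practical appeal of the conditional approach.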

ESTIMATION AND DIAGNOSTICS 

This section illustrates some related practical topics. The dis- 
cussion pertains to general pharmacometric applications of 
nonlinear mixed-effect models, unless explicitly restricted to 
latent variable models. 

Parameter estimation 

Eq. 1 falls under the class of nonlinear mixed-effect models, 
which require approximations for likelihood-based parameter 
estimation. E-R analyses often use the software NONMEM 
with the LAPLACIAN approximation option, although more 
advanced estimation options such as the stochastic approxi- 
mation expectation-maximization (SAEM) and importance 
sampling (IMP) have recently been included. 19 Hutmacher et
al. 6 have investigated the influence of additional estimation
methods, including Gaussian quadrature, with latent variable
models. The advanced methods have better theoretical
characteristics but are more complex and time consuming. Using
the LAPLACIAN option may be adequate when no obvious lack of
fit is present. Investigating advanced methods, even if
applied only at the final stage as a robustness check, could be
helpful. A sample NONMEM code is given in Supplementary
Data. 



Evaluating estimation uncertainty 

Estimation uncertainty is typically assessed with SEs of 
parameter estimates. Confidence intervals (CIs) are also 
commonly used. Statistical theory states that with large 
sample sizes, CIs may be computed from SEs, e.g., the 95% 
CI may be calculated from $\pm\Phi^{-1}(0.975) \times \mathrm{SE} = \pm 1.96 \times \mathrm{SE}$. 
This requires the so-called asymptotic normality assumption 
to hold, which is usually difficult to verify. In practice, it is 
not uncommon for CIs so computed to extend below zero 
for parameters that can take only positive values. Log transforming these 
parameters may be an easy way to improve the behavior of 
the CIs. Nevertheless, SEs for nonlinear mixed-effect models 
are known to be approximate; therefore, alternative assess- 
ments may be desirable. 
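
The effect of the log transformation on SE-based CIs can be seen in a minimal sketch. The estimate and SE below are hypothetical numbers, and the log-scale SE is approximated with the delta method (SE of log θ ≈ SE of θ divided by θ); in an actual analysis, the parameter would be estimated on the log scale and its SE taken directly from the fit.

```python
import math
from statistics import NormalDist

def wald_ci(estimate, se, level=0.95):
    """Symmetric CI on the natural scale: estimate +/- z * SE."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # 1.96 for a 95% CI
    return estimate - z * se, estimate + z * se

def log_scale_ci(estimate, se, level=0.95):
    """CI built on the log scale and back-transformed.

    Delta-method approximation (an assumption of this sketch):
    SE(log theta) ~= SE(theta) / theta.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se_log = se / estimate
    return estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)

# Hypothetical positive parameter with a large relative SE (60%)
est, se = 0.5, 0.3
natural = wald_ci(est, se)        # lower limit falls below zero
logged = log_scale_ci(est, se)    # both limits stay positive
print(natural)
print(logged)
```

The back-transformed interval is asymmetric around the estimate, which is often a more plausible shape for a parameter restricted to positive values.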

Bootstrap is currently the most computationally intensive 
and yet the most popular method used to evaluate model 
estimation. It is implemented by repeatedly resampling the 
subjects in the population with replacement, repeating 
the estimation step, and then examining the distribution of 
the resulting parameter estimates. The percentiles (e.g., 2.5 
and 97.5%) of the distribution form natural CIs (e.g., 95%) 
of the original model parameter estimates. The essence of 
the methodology is to use the resampled subject population to 
approximate the true population. 20 The importance of this concept 
may not have received due recognition: the appropriateness 
of the resampled population is determined by the stratification 
variables, yet these seem rarely to be described exactly and 
justified in typical bootstrap result reporting. In general, the 
true target population is specified not only by the subjects in 
the data but also by any covariates used to stratify the 
resampling, e.g., studies (when data from multiple studies are 
pooled together) or body weight categories. For the bootstrap 
results to indicate the true estimation uncertainty, it is thus 
important to stratify according to study design factors (which 
typically should be the studies and the study-specified 
stratification variables) and to ensure a sufficient number of subjects 
in each stratum. 10,21 
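
The resampling step can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular software: the data set is hypothetical (two pooled studies with binary responder flags), the function names are made up for this example, and the "estimation step" is replaced by a simple response rate so the sketch stays self-contained. In a real analysis, each resampled data set would be refitted with the full model.

```python
import random
from collections import defaultdict
from statistics import quantiles

def stratified_bootstrap_ids(subject_strata, rng):
    """Resample subject IDs with replacement within each stratum.

    subject_strata maps subject ID -> stratum label (e.g., study).
    Each stratum keeps its original number of subjects.
    """
    by_stratum = defaultdict(list)
    for sid, stratum in subject_strata.items():
        by_stratum[stratum].append(sid)
    sample = []
    for stratum in sorted(by_stratum):
        ids = by_stratum[stratum]
        sample.extend(rng.choices(ids, k=len(ids)))
    return sample

def percentile_ci_95(estimates):
    """2.5th and 97.5th percentiles of the bootstrap estimates."""
    cuts = quantiles(estimates, n=40)  # 39 cut points: 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]

rng = random.Random(1)
# Hypothetical pooled data: subject ID -> (study, responder flag)
data = {i: ("study1", int(rng.random() < 0.3)) for i in range(100)}
data.update({i: ("study2", int(rng.random() < 0.5)) for i in range(100, 160)})
strata = {sid: study for sid, (study, _) in data.items()}

boot = []
for _ in range(1000):
    ids = stratified_bootstrap_ids(strata, rng)
    # Stand-in for re-estimating the model on the resampled subjects
    boot.append(sum(data[s][1] for s in ids) / len(ids))
lo, hi = percentile_ci_95(boot)
print(lo, hi)
```

Because resampling is done within study, every bootstrap data set contains 100 subjects from the first study and 60 from the second, matching the original design.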

Likelihood profiling is another method to obtain CIs and 
understand the skewness of the estimation uncertainty. It 
assesses one parameter at a time and can be motivated 
from likelihood ratio tests. The assessed parameter is fixed at 
different values adjacent to the original parameter estimate, 
with the likelihood function maximized with respect to the 
remaining model parameters. The resulting objective function 
(-2 times the maximum log likelihood) values can then 
be numerically examined and plotted vs. the fixed values of 
the parameter assessed, and CIs may be obtained by finding 
the parameter values that correspond to the desired nominal 
changes of objective function values. This has been used in 
latent variable IDR model applications. 2,10 Free of concerns 
about stratification and typically requiring fewer model 
runs, likelihood profiling may be viewed as an efficient 
alternative to bootstrap. 
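
The profiling procedure can be illustrated with a deliberately small stand-in model: a fixed-effect logistic regression with an intercept and a slope, fitted to simulated data. Everything in this sketch (the data, the function names, the search brackets) is an assumption made for illustration; a mixed-effect model would replace `ofv` with the objective function reported by the estimation software. The slope is profiled, the intercept is re-estimated at each fixed slope, and the 95% CI limits are the slopes at which the objective function rises by 3.84 (the 95th percentile of the chi-square distribution with 1 degree of freedom).

```python
import math
import random

def ofv(a, b, xs, ys):
    """Objective function value: -2 * log likelihood of a logistic model."""
    ll = 0.0
    for x, y in zip(xs, ys):
        eta = a + b * x
        # log(1 + exp(eta)), computed without overflow
        ll += y * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return -2.0 * ll

def golden_min(f, lo, hi, tol):
    """Golden-section search for the minimizer of a unimodal function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0

def profile_ofv(b, xs, ys):
    """Fix the slope at b and minimize the OFV over the intercept."""
    a_hat = golden_min(lambda a: ofv(a, b, xs, ys), -50.0, 50.0, 1e-7)
    return ofv(a_hat, b, xs, ys)

def crossing(f, inside, outside, tol):
    """Bisection between a point with f < 0 and a point with f > 0."""
    while abs(outside - inside) > tol:
        mid = (inside + outside) / 2.0
        if f(mid) < 0.0:
            inside = mid
        else:
            outside = mid
    return (inside + outside) / 2.0

# Simulated data (hypothetical): true intercept -1.0, true slope 1.5
rng = random.Random(7)
xs = [rng.uniform(0.0, 2.0) for _ in range(200)]
ys = [int(rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 + 1.5 * x)))) for x in xs]

prof = lambda b: profile_ofv(b, xs, ys)
b_hat = golden_min(prof, -10.0, 10.0, 1e-4)   # slope estimate
ofv_min = prof(b_hat)
crit = 3.84                                    # chi-square(1 df), 95%
excess = lambda b: prof(b) - ofv_min - crit
ci = (crossing(excess, b_hat, -10.0, 1e-4), crossing(excess, b_hat, 10.0, 1e-4))
print(b_hat, ci)
```

The interval need not be symmetric around the estimate, which is how profiling reveals skewness in the estimation uncertainty.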

Model diagnostics 

Few generally useful goodness-of-fit diagnostics are available for 
categorical data. The visual predictive check (VPC) 22 may be the 
most useful and can be effectively implemented by plotting 
the observed frequencies of response against the prediction intervals 
(e.g., 90%) of the model. For binary data, the issue of grouping 
the data by covariates has been discussed. 23 For E-R modeling 
of clinical trial data, a natural way to group the responses 
is by study visit and treatment, 2,10,12-14 due to the importance 
of understanding the treatment response frequencies over 
planned time course in dosing regimen selection. 
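
A VPC grouped by treatment and visit can be sketched as below. The response probabilities, the trial design, and the function names are hypothetical. As a simplification, observations are simulated independently, whereas a mixed-effect model would also simulate the between-subject random effects for each replicate. The sketch computes the numbers behind the plot (observed frequency and 90% prediction interval per group) rather than the plot itself.

```python
import random
from statistics import quantiles

def response_rates(records):
    """Response frequency per (treatment, visit) group."""
    counts = {}
    for trt, visit, y in records:
        n, k = counts.get((trt, visit), (0, 0))
        counts[(trt, visit)] = (n + 1, k + y)
    return {g: k / n for g, (n, k) in counts.items()}

def vpc_intervals(design, prob, n_rep, rng):
    """Simulate n_rep trials from the model; return the 90% PI per group."""
    sims = {}
    for _ in range(n_rep):
        rep = [(trt, v, int(rng.random() < prob(trt, v))) for trt, v in design]
        for g, rate in response_rates(rep).items():
            sims.setdefault(g, []).append(rate)
    bands = {}
    for g, rates in sims.items():
        cuts = quantiles(rates, n=20)  # 19 cut points: 5%, 10%, ..., 95%
        bands[g] = (cuts[0], cuts[-1])
    return bands

def prob(trt, visit):
    """Hypothetical model-predicted response probability."""
    base = 0.15 + 0.05 * min(visit, 3)
    return base + (0.3 if trt == "active" else 0.0)

rng = random.Random(3)
# Hypothetical design: 2 treatments x 4 visits x 50 observations
design = [(trt, v) for trt in ("placebo", "active")
          for v in (1, 2, 3, 4) for _ in range(50)]
observed = [(trt, v, int(rng.random() < prob(trt, v))) for trt, v in design]

obs = response_rates(observed)
bands = vpc_intervals(design, prob, 500, rng)
for g in sorted(obs):
    lo, hi = bands[g]
    print(g, round(obs[g], 2), (round(lo, 2), round(hi, 2)))
```

Observed frequencies that fall outside the intervals more often than expected, or that drift systematically across visits, would point to model misspecification.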

Model validation 

Model validation may be a confusing term. In the previous 
decade, models were often claimed as "validated" after showing 
similarity between the means or medians of the distribution 
of bootstrapped parameter estimates and the original 
model parameter estimates, before Hu et al. 10 proved that 
such similarity can be expected to occur regularly despite 
arbitrarily large biases. Although bootstrap, jackknife, and 
cross-validation approaches in general do have 
potential, their implementation in mixed-effect modeling 
will require much more sophistication. 20 In a strict sense, 
model validation concerns predictive ability, which, to 
avoid subjectivity, could be assessed only with newly arrived 
data. Furthermore, because nonlinear mixed-effect models 
have multiple components, it is often unclear which ones 
should or could be validated for the particular application. For 
example, because the random-effect component (i.e., BSV) 
is not directly observed, related components such as covariate 
models are usually difficult to validate with sparse 
sampling: sound practical metrics are lacking for evaluating 
the difference between model-predicted and "observed" 
parameters, since empirical Bayes estimates are not mutually 
independent. 

For the purpose of dosing regimen decisions, the most 
important quantity is likely the time course of clinical 
response probabilities. Obtaining new data solely for the pur- 
pose of model validation may be difficult in early develop- 
ment, e.g., proof-of-concept stages. This should be possible 
in later development stages, where data from new trial(s) can 
be used to conduct external VPCs of the model developed. 14 
The performance, especially if the validation data contain 
new dosing regimens, can conveniently provide a natural and 
objective assessment of the predictive ability of the model. 

SUGGESTED LATENT VARIABLE MODEL DEVELOPMENT STEPS 

Latent variable IDR model development may often be rela- 
tively straightforward using data from placebo-controlled clin- 
ical trials where drug concentrations and clinical responses 
are measured at multiple time points. Reasonable models 
could be developed with the following main steps: 

1. Plot placebo data. Fit exponential model Eq. 9 to obtain 
initial estimates of placebo effects. 

2. Fit selected IDR model with Eqs. 7 and 8. 

3. Plot mean predictions vs. means of the data or examine 
VPC as diagnostics. 

The alternative placebo model Eq. 10 may be used instead 
of Eq. 9 if the placebo response time course appears to have 
plateaued. The reduced placebo model Eq. 11 can be used 
and tested in steps 1 and 2, when the response reflects 
change from baseline. All three types of IDR models (types I/III, 
II, and IV) may also be tested. 15 If additional data become 
available after initial model development, external VPC can 
be performed for model validation before pooling in the new 
data and re-estimating the model. Additional assessment of 
estimation uncertainty, e.g., likelihood profile or bootstrap, 
may be conducted. 

As an illustration, the E-R model development of the motivating 
example is briefly summarized below. In step 1, plotting 
the placebo data trend (Figure 1 of Hu et al. 2) suggested that 
the 1-degree-of-freedom reduced model Eq. 11 could suffice, 
which was confirmed by VPC of the initial placebo logistic 
regression model. Next, parameter estimates of a type I latent 
variable IDR model with Eqs. 7-8 were obtained. The VPC 
(Figure 2 of Hu et al. 2) showed reasonable model description 
of observed data. The more complex placebo effect model 
Eq. 9 did not significantly improve the fit. Comparing the initial 
placebo model and the final model parameter estimates and 
VPCs showed that the minor differences between parameter 
estimates did not affect model description of placebo 
data. Likelihood profile plots (Figure 4 of Hu et al. 2) then 
showed reasonable precision of IDR model parameters but 
high uncertainties in some placebo effect parameters. This 
supported the importance of parsimony, perhaps especially 
for the placebo effect model, even in this case of large phase 
III trials with reasonable sample sizes. Finally, the model was 
used to simulate possible time courses of ACR response 
probabilities under multiple alternative dosing regimens (Figure 
5 of Hu et al. 2) for dosing justification. In contrast, plotting 
the response probabilities in observed trough concentration 
quartile ranges suggested difficulties in identifying meaningful 
relationships for the direct correlation approach. 2 



DISCUSSION 

A general latent variable framework is described for clinical 
end point modeling that allows the utilization of all observed 
E-R data and better predictive ability, in comparison with 
some more empirical approaches. The framework is shown 
to suit the major types of clinical end points and may motivate 
additional models. The baseline-normalized model parameterization 
shows advantages in estimation stability. Extra correlation 
between end points may be accommodated in this 
framework using probit regression. Logit regression may be 
used equally effectively for single end point modeling. 

Mechanism consistency allows effective predictions of 
different dosing regimen outcomes. The presentation here 
focused on IDR models, but models based on other mecha- 
nisms can be used if desired. Note that for the predictions 
to be robust, especially in phase II when data are relatively 
sparse, the models should be parsimonious and have prac- 
tically interpretable parameters. As attributed to Einstein, 
"everything should be made as simple as possible, but no 
simpler." For dosing regimen decisions, the important factors 
in E-R modeling typically are: baseline, placebo maximum 
effect and rate of onset, and drug maximum effect and rate 
of onset. These correspond exactly to the latent variable IDR 
model parameters, e.g., ($\{\alpha_k\}$, $P_{\max}$, $r$, $R_0$, and $k_{out}$) in Eqs. 
7-9. In this sense, the latent variable IDR models may often 
be expected to be optimal for dosing regimen decisions in 
clinical drug development. Model building and assessment 
could often be relatively straightforward. 